\subsection{Naive Bayes}
To evaluate Naive Bayes, we experimented with various combinations of attributes. The following table details our results; C, A, and U denote Churn, Appetency, and Upselling respectively. The Top 5, Top 10, etc. rows in the table represent the top $k$ attributes chosen from the attributes ordered by gain ratio.

%
\begin{table} 
\caption{Naive Bayes Results}
\centering
\scalebox{0.8} {
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}

\hline

Attributes & No. of Attributes & \multicolumn{3}{c|}{Accuracy (\%)} & \multicolumn{3}{c|}{Sensitivity} & \multicolumn{3}{c|}{Specificity} & \multicolumn{3}{c|}{Balanced Accuracy}\tabularnewline

\hline

\hline

& & C & A & U & C & A & U & C & A & U & C & A & U\tabularnewline

\hline

Categorical Only & 40 & 91.69 & 96.61 & 91.45 & 0.04 & 0.03 & 0.01 & 0.98 & 0.98 & 0.99 & 0.51 & 0.51 & 0.50 \tabularnewline

\hline

Top 5 & 5 & 89.30 & 98.22 & 91.23 & 0.06 & 0.00 & 0.06 & 0.96 & 0.99 & 0.96 & 0.51 & 0.50 & 0.51 \tabularnewline

\hline

Top 10 & 10 & 86.03 & 94.34 & 89.29 & 0.13 & 0.11 & 0.08 & 0.92 & 0.96 & 0.96 & 0.53 & 0.53 & 0.52 \tabularnewline

\hline

Top 15 & 15 & 86.90 & 92.10 & 89.14 & 0.13 & 0.18 & 0.08 & 0.93 & 0.93 & 0.96 & 0.53 & 0.56 & 0.52 \tabularnewline

\hline

Top 20 & 20 & 86.94 & 91.75 & 89.06 & 0.15 & 0.17 & 0.08 & 0.93 & 0.93 & 0.95 & 0.54 & 0.55 & 0.52 \tabularnewline

\hline

Top 25 & 25 & 86.88 & 93.40 & 88.71 & 0.15 & 0.15 & 0.09 & 0.93 & 0.95 & 0.95 & 0.54 & 0.55 & 0.52 \tabularnewline

\hline

Top 50 & 50 & 85.22 & 92.63 & 88.09 & 0.20 & 0.17 & 0.11 & 0.90 & 0.94 & 0.94 & 0.55 & 0.55 & 0.53 \tabularnewline

\hline

Top 100 & 100 & 83.54 & 92.28 & 86.88 & 0.23 & 0.18 & 0.13 & 0.88 & 0.94 & 0.93 & 0.56 & 0.56 & 0.53 \tabularnewline

\hline

Top 150 & 150 & 82.63 & 92.21 & 86.08 & 0.24 & 0.18 & 0.15 & 0.87 & 0.93 & 0.92 & 0.56 & 0.56 & 0.53 \tabularnewline

\hline

All attributes & 230 & 92.57 & 98.22 & 92.65 & 0.00 & 0.00 & 0.00 & 1.00 & 1.00 & 1.00 & 0.50 & 0.50 & 0.50 \tabularnewline

\hline

\end{tabular}
}
\end{table}
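The Top $k$ subsets above were built by ranking the attributes by gain ratio and keeping the $k$ highest-ranked ones. A minimal sketch of that selection step in Python (the attribute names and scores here are hypothetical placeholders, not values from our data):

```python
def top_k_attributes(scores, k):
    """Return the names of the k attributes with the highest
    gain-ratio score, as used for the Top 5/10/... subsets."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]

# Hypothetical gain-ratio scores for four attributes:
scores = {"Var126": 0.21, "Var218": 0.18, "Var7": 0.05, "Var1": 0.00}
print(top_k_attributes(scores, 2))  # ['Var126', 'Var218']
```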

\begin{figure*}[t]
\begin{minipage}[t]{\columnwidth}
\centering
\psfig{figure=figures/BAC.pdf,width=5in,height=2.5in}
\caption{Balanced accuracy vs. number of top gain-ratio-ranked attributes}
\label{fig:PCB}
\end{minipage}
\hfill
\end{figure*}

\begin{figure*}[h!t]
\begin{minipage}[t]{\columnwidth}
\centering
\psfig{figure=figures/Accuracy.pdf,width=5in,height=2.5in}
\caption{Overall accuracy vs. number of top gain-ratio-ranked attributes}
\label{fig:Accuracy}
\end{minipage}
\hfill
\end{figure*}

\noindent The table and the graphs detail the relationship between the overall accuracy and the balanced accuracy, which was used in the competition. Including more attributes increases the balanced accuracy but decreases the overall accuracy. We believe this is because adding more gain-ratio-ranked attributes allows the model to predict positive-class samples more accurately, increasing the sensitivity. At the same time, this improved predictability of the positive class comes at the cost of mispredicting negative-class samples, which lowers the overall accuracy.
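The trade-off between the two measures can be made concrete with a small sketch (the confusion-matrix counts are illustrative, assuming roughly a 7\% positive rate similar to the churn task):

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Balanced accuracy = (sensitivity + specificity) / 2."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return (sensitivity + specificity) / 2

# A model that labels every sample negative on a 7%-positive set
# has high overall accuracy but chance-level balanced accuracy:
tp, fn, tn, fp = 0, 70, 930, 0
overall = (tp + tn) / (tp + fn + tn + fp)
print(overall)                             # 0.93
print(balanced_accuracy(tp, fn, tn, fp))   # 0.5
```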

\noindent We also observe from the data above that a reversal occurs in the graph after 150 attributes are included. As we start including the lower half of the gain-ratio-ranked attributes, whose gain ratios are significantly lower than those of the upper half, the model again predicts a large number of instances as negative. Finally, when all attributes are included, every sample is predicted as negative, dropping the sensitivity to 0. It should also be noted that including all attributes brings in attributes with a gain ratio of 0 (the last 23 attributes have a gain ratio of 0).
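For reference, the gain ratio used for the ranking is an attribute's information gain divided by its split information (intrinsic value). A minimal sketch for categorical attributes, assuming binary labels:

```python
import math
from collections import Counter

def entropy(xs):
    """Shannon entropy (in bits) of the value distribution of xs."""
    n = len(xs)
    return -sum(c / n * math.log2(c / n) for c in Counter(xs).values())

def gain_ratio(values, labels):
    """Information gain of the attribute divided by its split
    information; attributes with zero split information score 0."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0

# A perfectly predictive attribute scores 1; an irrelevant one scores 0:
print(gain_ratio(["a", "a", "b", "b"], [1, 1, 0, 0]))  # 1.0
print(gain_ratio(["a", "b", "a", "b"], [1, 1, 0, 0]))  # 0.0
```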

\begin{figure*}[h!t]
\begin{minipage}[t]{\columnwidth}
\centering
\psfig{figure=figures/scatter.pdf,width=3in,height=2in}
\caption{Scatter plot of gain ratio}
\label{fig:scatter}
\end{minipage}
\hfill
\end{figure*}
